Abstract — We provide a model to understand how adverse
weather conditions modify traffic flow dynamics. We first show
that the microscopic Free Flow Speed of the vehicles is changed
and then provide a rule to model this change. For this, we
consider a thresholded linear model, corresponding to an application
of a MARS model [1] to road traffic. This model adapts itself
locally to the whole road network and provides accurate, unbiased
speed forecasts using live or short-term forecasted weather data.

Index Terms — Forecasting method, Linear thresholded model,
Road traffic, Spatial extrapolation, Weather.



I. Introduction 

It is commonly accepted that adverse weather conditions
significantly modify traffic flow dynamics in a complex
way. It is well known that bad weather conditions such as
heavy rain, fog or snow induce a significant decrease
in traffic flow speeds. Note that this can be partially explained
by legal speed regulations. However, although several studies
conclude that road traffic speed decreases during adverse
weather, this trend has only been confirmed, not quantified.
Furthermore, to our knowledge, no quantitative analysis has
been conducted to forecast the evolution of the observed speed
of vehicles. Even deterministic models for road traffic fail here,
since they usually involve a large set of parameters which
are all affected by the change of weather conditions, which
prevents the use of such equations. In this paper, we
tackle this issue and provide a general model to estimate
the change in traffic flow speed for different weather scenarios.

Understanding the impact of weather conditions requires
a direct comparison between observed traffic speeds whose
variability is only due to such changes. Hence, quantifying
the impact of adverse weather conditions on traffic speed can
only be conducted through the analysis of paired data, where
each pair corresponds to similar traffic conditions but
different weather conditions. This is a difficult issue since
road traffic is a non-stationary phenomenon and much of the
variability is due to changes in road conditions (such as the
occurrence of traffic jams). Empirical studies are therefore not
straightforward because of the heterogeneity of the data.
Moreover, some conditions are scarcely observed. This increases
the difficulty of the estimation and requires choosing a set of
weather conditions that makes sense for the road manager while
retaining a large enough number of observations.

Some work has already been conducted in this direction.
We refer to [2], [3], [4], [5], [6] and [7]. Nevertheless, the
drawback of previous methods is that the speed modifications
are global: the speeds are affected without allowing these
changes to depend on the value of the initial velocity. In this
work, we use a more flexible model, based on Multivariate
Adaptive Regression Splines (MARS), to take this initial
condition into account. Furthermore, using a threshold makes
it possible to consider different impacts according to different
levels of weather change. Such a procedure has been described
in [8] and its implementation is detailed in [1]. Some work
was conducted in the same direction, see for instance [9],
[10] and references therein. The calibration of the model is
achieved by estimating the parameters on a learning set. This
set is built by selecting pairs of observations under the
same traffic condition but with a different weather condition.
We study the performance of this model and show that it
makes it possible to forecast the possible speed evolutions
accurately.

The paper is organized as follows. Section II is devoted
to the description of the data and their particularities.
Section III describes the construction of the model, while its
performance is analyzed in Section IV. Finally, we discuss
some of our results and draw some guidelines for speed
forecasting under adverse weather conditions in Section V.

II. Data and Issues 

A. Description 

A road network can be represented as a set of links
connected together in a form depending on the underlying
road network. Usually, links are classified by a well-known
attribute: the Functional Road Class (FRC) [11]. FRC is a
classification based on the importance of the road in the
connectivity of the total network. Table I recaps the relation
between the FRC and the network. In our study, industrial
constraints make us work only on roads from FRC 0 to
2 because most of the information provided by
Mediamobile that matches customer demand concerns
this FRC range.



TABLE I

Full name and attribute values by FRC values

FRC   Full name and attribute values
0     Motorway, Freeway, or other Major road
1     Major road less important than a motorway
2     Other major road
3     Secondary road
4     Local connecting road
5     Local road of high importance
6     Local road
7     Local road of minor importance
8     Other road



Vehicles equipped with a GPS device can report their
positions to a server at any time, so that they may
be considered as floating sensors on the network. Such
sensors form the Floating Car Data (FCD) source of traffic
information. A map-matching algorithm (out of the scope of
this article) establishes a speed $V(x, t)$ on a link $x$ at a
time $t$ from a pair of successive positions and times by
matching them on a digitized network. Then, when a significant
number of speeds on a link $x$ are at hand, we may produce
a microscopic processed FCD speed $V(x, t)$. With this
collection technology, we do not depend geographically
on any counting station, but we are limited by the feedback of
GPS users. Nevertheless, this source of traffic data is relevant
since we have a huge amount of data (515 798 606 positions
in March 2011 for instance) and we potentially cover the whole
network. The traffic database used in the paper was provided
by Mediamobile and is composed of the vehicle speeds over
T = 107 days.

The weather database used in this paper was also provided
by Mediamobile through its partnership with Meteo-France.
Since 2009, Meteo-France has provided Mediamobile with a new
high-quality flow of data. It incorporates a geo-tagged point
for every unit of the 120 000 road kilometers of the French
network. This flow of meteorological data is an aggregation of
real-time and forecasted observations. For our study, we thus
have at hand a regular flow of weather data $\mathcal{M}(x, t)$
on a link $x$ at a time $t$, updated every 15 minutes. In this
paper, we focus on the following bad weather conditions: soft
rain, medium and strong rain, mixed rain and snow, and drizzle.



B. Data quality 

Individual microscopic traffic data usually exhibit high noise
and outliers due to several causes:

• GPS logger accuracy,

• incorrect vehicle path estimation: a wrong projection of
the vehicle path can lead to incorrect speeds and, further,
to speeds on incorrect links,

• inner variations of individual vehicles in the traffic flow.

To decrease the noise and eliminate outliers, we use a
two-step filtering algorithm:

1) the first step filters out aberrant data. Mediamobile
estimates a Free Flow Speed (FFS), defined as the most
likely speed in free flow traffic conditions. We then
filter out aberrant data where speed records are higher
than 150% of the reference speed, i.e. the FFS,

2) the second step discards links that do not have enough
records. Here, we arbitrarily set to 100 measures the
minimum amount of data necessary to keep a link.

After performing this algorithm, we have reliable traffic
data. Weather data are already consistent because they have
been preprocessed by Meteo-France, so they do not need
any further treatment.
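
The two-step filter above can be sketched as follows. This is only a minimal illustration, not Mediamobile's implementation: the record layout `(link_id, time, speed)`, the function name `filter_speed_records` and the dictionary `ffs_by_link` are hypothetical, and we assume that the 100-record minimum is counted after the aberrant speeds have been removed, which the text does not specify.

```python
from collections import defaultdict


def filter_speed_records(records, ffs_by_link, max_ratio=1.5, min_records=100):
    """Two-step filter: drop aberrant speeds, then drop sparse links.

    records: iterable of (link_id, time, speed) tuples (hypothetical layout).
    ffs_by_link: dict mapping link_id to its Free Flow Speed (same unit as speed).
    """
    by_link = defaultdict(list)
    for link_id, t, speed in records:
        ffs = ffs_by_link.get(link_id)
        if ffs is None:
            # Assumption: a record without a reference speed cannot be checked.
            continue
        # Step 1: discard speeds higher than 150% of the Free Flow Speed.
        if speed <= max_ratio * ffs:
            by_link[link_id].append((t, speed))
    # Step 2: keep only links with at least min_records remaining measures.
    return {link: recs for link, recs in by_link.items() if len(recs) >= min_records}
```

Both thresholds are exposed as parameters since the text describes the value of 100 as an arbitrary choice.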

C. Issues 

The main issues are twofold:

• building a learning set for weather conditions. The
main goal is to build pairs of speeds at a given
location observed under the same traffic condition but
with a different weather condition, in order to understand
the consequences of weather conditions on road traffic
behavior. First, we need to associate traffic data and
weather data. The frequency of the weather data flow is 15
minutes. So even if a weather condition is observed at a
time $t_0$ only, we propagate it to the whole interval
$[t_0, t_0 + 15[$ (in minutes), such that all speeds $V(x, t)$
observed in this interval get paired with $\mathcal{M}(x, t_0)$,

• finding a predictive rule to forecast velocities. We use
the heuristic idea that adverse weather conditions do not
affect all velocities in the same way. We therefore build
a model that includes a different treatment for different
ranges of speeds. This rule must be stable enough to be
extended to the whole road network, while still providing
good enough estimations.
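
The propagation of a weather record to its 15-minute interval can be sketched as below, for a single link. The data layout (lists of tuples with times in minutes) and the name `attach_weather` are assumptions made for illustration only; speeds that fall in no weather interval are simply left unpaired.

```python
from bisect import bisect_right

WEATHER_PERIOD_MIN = 15  # update period of the weather flow stated in the text


def attach_weather(speed_obs, weather_obs, period=WEATHER_PERIOD_MIN):
    """Pair each speed on one link with the weather valid at its timestamp.

    speed_obs: list of (t, speed), t in minutes (hypothetical layout).
    weather_obs: list of (t0, condition) for the same link, sorted by t0.
    A condition observed at t0 is propagated to [t0, t0 + period).
    """
    starts = [t0 for t0, _ in weather_obs]
    paired = []
    for t, speed in speed_obs:
        i = bisect_right(starts, t) - 1  # latest weather record with t0 <= t
        if i >= 0 and t < starts[i] + period:
            paired.append((t, speed, weather_obs[i][1]))
        # Otherwise no weather record covers t; the speed stays unpaired.
    return paired
```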

III. Modeling 

A. Regression 

Our aim is to correct the forecasted speed according to road
weather conditions. Many methods exist to forecast speeds on
a road network, but this is out of the scope of this
article. Some examples of such methods can be found in [12]
and [13]. In this work, we already have forecasted speeds at
hand and we want to correct them according to road weather
conditions. This is done by applying a correction to the
forecasted speed. We focus on a bias depending on the speed.
Indeed, expertise in road traffic theory shows that
under adverse road weather conditions, drivers reduce their
velocities when they go fast whereas they do not change them
otherwise.

A usual way to correct the speed is to use a piecewise
polynomial function of degree 1

$$\forall t \in \mathcal{V}(t_0), \quad \hat{V}(x, t) = \sum_{k=1}^{p} P_{\theta_k}\big(V(x, t_0)\big) \times \mathbb{1}_{[\alpha_k, \alpha_{k+1}[}\big(V(x, t_0)\big) \tag{$\mathcal{M}_1$}$$

with $\theta_k = [a_k; b_k]$, $P_{\theta_k}$ a polynomial of degree 1 such
that $P_{\theta_k}\big(V(x, t_0)\big) = a_k \cdot V(x, t_0) + b_k$, and levels of
speeds $[\alpha_k; \alpha_{k+1}]$, $\forall k \in [1..p]$, such that
$\alpha_1 = 0$ and $\alpha_{p+1} = +\infty$. $V(x, t_0)$ is the forecasted
speed at time $t_0$ and $\hat{V}(x, t)$ its corrected speed.
$\mathcal{V}(t_0)$ is a short-term neighborhood around $t_0$. This means
that we adjust $V(x, t_0)$ in a time neighborhood as long as no
newer speed data is available.
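
As an illustration of how such a piecewise correction is evaluated, the following sketch locates the speed interval containing the forecasted speed and applies the corresponding degree-1 polynomial. The function name and the numerical parameters are made up for the example; they are not estimates from the paper.

```python
from bisect import bisect_right


def piecewise_linear_correction(v0, alphas, thetas):
    """Evaluate sum_k (a_k * v0 + b_k) * 1[alpha_k <= v0 < alpha_{k+1}].

    alphas: increasing lower bounds [alpha_1, ..., alpha_p] with alpha_1 = 0;
            the last interval extends to +infinity.
    thetas: list of (a_k, b_k) pairs, one per interval.
    """
    if len(alphas) != len(thetas) or v0 < alphas[0]:
        raise ValueError("inconsistent breakpoints or speed below alpha_1")
    k = bisect_right(alphas, v0) - 1  # index of the interval containing v0
    a_k, b_k = thetas[k]
    return a_k * v0 + b_k


# Illustrative (made-up) parameters: identity below 80 km/h, reduction above.
example_alphas = [0.0, 80.0]
example_thetas = [(1.0, 0.0), (0.5, 40.0)]
```

Since the intervals are disjoint, exactly one indicator is non-zero for a given speed, so the sum reduces to a single term.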

This model is no more than the so-called Multivariate
Adaptive Regression Splines (MARS) introduced in [1], where
non-linearities match changes in driver behavior. We refer to
model $(\mathcal{M}_1)$ as the MARS model. Indeed, we can rewrite
model $(\mathcal{M}_1)$ in a classical MARS form as



$$\forall t \in \mathcal{V}(t_0), \quad \hat{V}(x, t) = \sum_{k=1}^{p} Q_{\Gamma_k}(h_{1k}, h_{2k}, h_{3k})$$

with $\Gamma_k = [\gamma_{1k}; \gamma_{2k}; \gamma_{3k}]$, $Q_{\Gamma_k}$ a polynomial of degree 1
such that $Q_{\Gamma_k}(h_{1k}, h_{2k}, h_{3k}) = \gamma_0 + \gamma_{1k} \cdot h_{1k} + \gamma_{2k} \cdot h_{2k} + \gamma_{3k} \cdot h_{3k}$,
a set of hinge functions

$$h_{1k} = \max\big(0, V(x, t_0) - \alpha_k\big), \quad h_{2k} = \max\big(0, V(x, t_0) - \alpha_{k+1}\big), \quad h_{3k} = \max\big(0, \alpha_{k+1} - V(x, t_0)\big)$$

and levels of speeds $[\alpha_k; \alpha_{k+1}]$, $\forall k \in [1..p]$, such that
$\alpha_1 = 0$ and $\alpha_{p+1} = +\infty$.

We point out in Section IV-B that the model $(\mathcal{M}_1)$ suffers
from a lack of stability on the network. Indeed, it does not have
a homogeneous structure across the network. This means that
we are not able to extrapolate the model to all links. Moreover,
this model may apply a correction to low speeds under adverse
weather conditions. Hence, it does not fit an important feature:
bad weather conditions have no impact on drivers' behavior at
low speeds. That is why we rather use the following variant of
model $(\mathcal{M}_1)$, which fits traffic flow theory better and is
easier to extrapolate

$$\forall t \in \mathcal{V}(t_0), \quad \hat{V}(x, t) = V(x, t_0) \times \mathbb{1}_{\left[0, \frac{\theta_0}{1 - \theta_1}\right[}\big(V(x, t_0)\big) + \big(\theta_1 \cdot V(x, t_0) + \theta_0\big) \times \mathbb{1}_{\left[\frac{\theta_0}{1 - \theta_1}, +\infty\right[}\big(V(x, t_0)\big) \tag{$\mathcal{M}_2$}$$

We refer to model $(\mathcal{M}_2)$ as the linear thresholded model
since bad weather conditions have no consequence on road
traffic flow under the critical speed $\theta_0 / (1 - \theta_1)$. Beyond this
value, vehicles decrease their speeds linearly.

B. An association between speeds

We wish to build a matching between speeds of vehicles
at a given location observed under the same traffic condition
but with a different weather condition. To this end, we use the
following scheme (illustrated in Fig. 1):

1) extract the occurrence times $t_0$ of weather changes,

2) build a time neighborhood $\eta(t_0)$ around $t_0$ in such a
way that speeds in this neighborhood are stationary. In
practice, we arbitrarily fix $\eta(t_0) = [t_0 - h, t_0]$
with $h = 5$ minutes. This is generally narrow enough to
warrant stationarity,

3) let $t^*$ be the time of the latest observed speed before
the weather change. Finding at least one observation in
$\eta(t_0)$ for all $t_0$ is quite unlikely with our traffic data
source. In fact, FCD are observed at random times so
$t^*$ may not exist. The main consequence
is a very sparse dataset. Thus, we decide to slightly relax
the assumption of traffic stationarity by allowing
ourselves to pick similar velocities even if they belong to
different observation days. This means that we assume
stationarity between days. The only criterion that matters
to pair the data is that they belong to
the same temporal neighborhood $\eta(t_0)$. This relaxation
does not damage stationarity more than
the weather change itself does,

4) associate $V(x, t^*)$ to $V(x, t_0)$ where $\mathcal{M}(x, t^*) \neq \mathcal{M}(x, t_0)$.
For instance, if we want to study the impact
of rain on microscopic speeds, $V(x, t^*)$ corresponds to
speeds observed in no rain and no snow weather conditions
(i.e. $\mathcal{M}(x, t^*) = \text{NONE}$) and $V(x, t_0)$ corresponds
to speeds observed in the rain (i.e. $\mathcal{M}(x, t_0) = \text{RAIN}$).



Fig. 1. Matching between $V(x, t^*)$ and $V(x, t_0)$ (horizontal axis: time): the
solid black line at $t_0$ corresponds to an occurrence of rain and $V(x, t_0)$,
the ringed star, is the corresponding speed; white points correspond to points
included in $\eta(t_0) = [t_0 - h, t_0]$ and $V(x, t^*)$, the black point,
corresponds to the latest speed in $\eta(t_0)$.


IV. Results 

We have introduced in Section III-A two models that are
able to represent the impact of bad weather conditions on
microscopic speeds. In this section, we first aim to select the
best link-by-link model using a statistical approach. This gives
a solution to the local modeling of our problem. Nevertheless,
we also need to find a way to turn our link-by-link models
into a network-wide model. This task is carried out right after
the residual study for model selection.

The following results have been established with the data
presented in Section II, restricted to the region of Toulouse in
France over 107 days, from November 1st, 2009 to February 15th,
2010. We focus on the impact of rain because it is
the most common adverse weather condition in France. We thus
work with 256 832 observations on 2070 links.

A. Error 



We highlighted two models: the MARS model $(\mathcal{M}_1)$ and the
linear thresholded model $(\mathcal{M}_2)$. Here, we face a classical
model selection problem: we have to choose only one of these
two models, and thus need a criterion to compare them.
We split our data into two parts: a learning sample containing
90% of the observations to estimate our models, and a test sample
containing the remaining 10% to validate and select the right
model.

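For each link of the network, we estimate both models
by minimizing the Root Mean Squared Error (RMSE). The
quality of a model is established by a classical goodness-of-fit
value over all links, calculated with the test sample

$$\mathrm{RMSE} = \sum_{\text{links}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \big(V(x, t)_i - \hat{V}(x, t)_i\big)^2}$$

where, for a given link $x$, $n$ is the number of test observations,
$V(x, t)_i$ the $i$-th observed speed and $\hat{V}(x, t)_i$ the
corresponding corrected speed.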

We obtain an RMSE of 10 359.8 for the MARS model and
11 795.27 for the linear thresholded model. So the MARS
model fits the data better than the linear thresholded one, but
the difference of 1435.47 (13.8%) is not significant since it has
been calculated on 552 links. Moreover, the MARS model has
a more complex structure than the linear thresholded one,
because the latter is nothing more than a particular constrained
MARS model, so the MARS model fits the data better by
construction. Nevertheless, the RMSEs on each link are similar
for both models, as shown in Fig. 2. This means that although we
conclude that the MARS model is better, the linear thresholded
model is not far behind and has the undeniable advantage of being
extrapolable to a whole network, whereas MARS models are not
because they do not have a homogeneous structure across the
network. This is detailed in the next section.

B. Stability and extrapolation 

We aim to set up a simple method to apply a correction
to microscopic speeds based on adverse weather conditions
on a whole complex network at once, rather than on each
link of the network taken separately. In fact, the French road
network from FRC 0 to 3 is composed of 1 740 462 links, and
there are obviously not as many different behaviors
in response to adverse weather conditions. Thus, our wish to
find a global method is justified. In this section, we discuss
how we can generalize link-dependent models to a
global network model. The interest of extrapolating our models
matches our industrial constraints:

• to sum up and simplify our link-by-link models,

• to apply a correction on speeds under adverse weather
conditions on uncovered links.

Fig. 2. Biplot of RMSE by links for MARS model and linear thresholded
model

First, we focus on the MARS model. For each link of the
network, we have estimated a MARS model. We quickly face
a problem when extrapolating this kind of model: although it
fits the data well, its structure is complex and differs
among links.

For instance, Fig. 3 shows three theoretical MARS models
on three links. We observe that the number of slopes and
the number of breakpoints used to model non-linearities may
differ from one link to another, making a global structure for a
network difficult to build. Moreover, this kind of model is not
consistent with road traffic theory. As a matter of fact,
under adverse road weather conditions drivers reduce
their velocities at high speeds while their behavior is
unaffected at low ones, which an unconstrained MARS model
does not always reproduce.

Fig. 3. Example of three MARS models on three links

To sum up, link-by-link MARS models cannot be
extrapolated to a whole network, mostly because they do not
have a homogeneous structure across links.

Second, we deal with linear thresholded models. Until
now, our linear thresholded models were local because we
built one model per link. With this kind of model, we always
get a pair of parameters $(\theta_0, \theta_1)$ for each link, which means
that the structure is homogeneous across the network. So it is
possible to extrapolate link-by-link models. Fig. 4 shows a
2-dimensional kernel density estimation of the distribution of
the two parameters for all these models. Two modes appear:
one is associated with low values of $\theta_0$ ($\theta_0 < 55$ km/h) and
another with high ones ($\theta_0 > 55$ km/h). In fact, Fig. 5 shows that
the marginal distribution of $\theta_0$ is highly related to the FRC.

Fig. 4. Global distribution of $(\theta_0, \theta_1)$

Fig. 5. Biplot of $(\theta_0, \theta_1)$ for all links with their FRC

We recall that we wish to build a global model for
a network; in other words, our goal is to construct a model
that can be applied to the whole network. That is why we
need to capture this dependence between $\theta_0$ and the FRC. A
way to do that is to find a normalization $u$ such that the
marginal distribution of $\tilde{\theta}_0 = u(\theta_0, x)$ does not depend on
the FRC. The use of $u(\cdot, x) = \cdot / \mathrm{FFS}(x)$ normalizes $\theta_0$
correctly. Indeed, the FFS is highly correlated with the FRC
(see Fig. 6) and this normalization appears to be the most
relevant and significant (F-tests at the 5% significance level
have been performed).

Fig. 6. Boxplots of Free Flow Speeds by FRC

Fig. 7 points out that the joint distribution of $(\tilde{\theta}_0, \theta_1)$ is
now concentrated around one mode. We have thus stabilized
the parameters over all FRC and thus over the whole network.

The parameters of the global model are taken as the empirical
means of the parameters of the link-by-link models, i.e.
$(\bar{\theta}_0, \bar{\theta}_1)$ is the mean over links of $(\tilde{\theta}_0, \theta_1)$.
One could naturally expect $(\bar{\theta}_0, \bar{\theta}_1)$ to depend not only on
the FRC but also on the climate zone. Comparing the RMSE of the
global model on climate zones with the RMSE of models aggregated
by climate zone, we concluded that the global model was better in
each case, which means that the discrimination by climate zone is
not significant.
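
The construction of the global parameters can be illustrated with the short sketch below: the link-level threshold parameter is divided by the Free Flow Speed of its link, and both parameters are then averaged over links. The function name and the dictionary layout are assumptions made for the example.

```python
from statistics import mean


def global_parameters(link_params, ffs_by_link):
    """Average link-level parameters after normalizing theta0 by the FFS.

    link_params[link] = (theta0, theta1), theta0 in the same unit as the FFS.
    ffs_by_link[link] = Free Flow Speed of the link.
    Returns (theta0_bar, theta1_bar), where theta0_bar is dimensionless.
    """
    normalized_theta0 = [
        theta0 / ffs_by_link[link] for link, (theta0, _) in link_params.items()
    ]
    theta1_values = [theta1 for _, theta1 in link_params.values()]
    return mean(normalized_theta0), mean(theta1_values)
```

After normalization the first parameter is a proportion of the Free Flow Speed, which is what makes it comparable across road classes.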



Finally, our two link-by-link models $(\mathcal{M}_1)$ and $(\mathcal{M}_2)$ have
similar RMSE on our data. Nevertheless, the MARS model cannot
be extrapolated by construction and it does not respect road
traffic theory. Thus, the correct model for our problem is
the linear thresholded model $(\mathcal{M}_2)$, because it fits the data
nearly as well as the MARS model, but it also respects road
traffic theory and can easily be extrapolated to the whole
network. Constraining the MARS model makes it homogeneous across
all the links and consequently extrapolable. The form of the
global linear thresholded model for the whole network is



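$$\forall t \in \mathcal{V}(t_0), \quad \hat{V}(x, t) = V(x, t_0) \times \mathbb{1}_{\left[0, \frac{\bar{\theta}_0 \cdot \mathrm{FFS}(x)}{1 - \bar{\theta}_1}\right[}\big(V(x, t_0)\big) + \big(\bar{\theta}_1 \cdot V(x, t_0) + \bar{\theta}_0 \cdot \mathrm{FFS}(x)\big) \times \mathbb{1}_{\left[\frac{\bar{\theta}_0 \cdot \mathrm{FFS}(x)}{1 - \bar{\theta}_1}, +\infty\right[}\big(V(x, t_0)\big) \tag{$\mathcal{M}_3$}$$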



It is interesting to rewrite the model in a form that is
interpretable for road traffic experts. Let
$\alpha = \frac{\bar{\theta}_0}{1 - \bar{\theta}_1}$ and $\beta = 1 - \bar{\theta}_1$; we obtain

$$\hat{V}(x, t) = \begin{cases} V(x, t_0) - \beta \cdot \big(V(x, t_0) - \alpha \cdot \mathrm{FFS}(x)\big) & \text{if } V(x, t_0) > \alpha \cdot \mathrm{FFS}(x), \\ V(x, t_0) & \text{otherwise.} \end{cases} \tag{$\mathcal{M}_4$}$$

The interpretation is thus natural and is illustrated in
Fig. 8. If a vehicle travels at a speed above the proportion
$\alpha \cdot \mathrm{FFS}(x)$ of the Free Flow Speed and it starts raining,
we decrease its speed by $\beta \cdot \big(V(x, t_0) - \alpha \cdot \mathrm{FFS}(x)\big)$, i.e. a
proportion of the difference between its initial speed and that
proportion of the Free Flow Speed. $\alpha \cdot \mathrm{FFS}(x)$ represents
the speed at which adverse weather conditions start impacting
speeds.



Let us consider a basic example to practice with the model,
using the estimates $(\bar{\theta}_0, \bar{\theta}_1) = (0.66, 0.16)$ obtained with
our data: a vehicle is recorded at 130 km/h on a freeway where the
Free Flow Speed is 130 km/h and it starts raining. Since
130 > 0.66 × 130/(1 - 0.16) ≈ 102.1, we apply a correction and the
speed is reduced to 0.16 × 130 + 0.66 × 130 = 106.6
km/h. Note that this respects the French speed limit on
a freeway (130 km/h in general, decreasing to 110 km/h
when raining).
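
The worked example can be checked with a few lines of code. The sketch below implements the thresholded rule in its interpretable form, with the estimates (0.66, 0.16) reported in the paper as default values; the function name `corrected_speed` is ours.

```python
def corrected_speed(v0, ffs, theta0=0.66, theta1=0.16):
    """Correct a forecasted speed v0 on a link with Free Flow Speed ffs.

    theta0 is the normalized (dimensionless) threshold parameter and
    theta1 the slope applied above the threshold.
    """
    alpha = theta0 / (1.0 - theta1)  # threshold as a proportion of the FFS
    beta = 1.0 - theta1              # proportion of the excess speed removed
    threshold = alpha * ffs
    if v0 > threshold:
        return v0 - beta * (v0 - threshold)
    return v0  # below the threshold, the weather has no effect


# Worked example of the text: 130 km/h on a freeway with FFS = 130 km/h.
example = corrected_speed(130.0, 130.0)
```

Above the threshold the result coincides with theta1 * v0 + theta0 * ffs, so the two forms of the model give the same corrected speed of about 106.6 km/h in this example.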

Until now, we have selected the linear thresholded model
and extrapolated it to a network. We know that extrapolating
by normalizing $\theta_0$ by the Free Flow Speed was the
most relevant choice according to our results, but we also need
to measure the actual loss of quality caused by extrapolating the
model. So we compute and compare the two following quantities,
still calculated on the test sample

$$\mathrm{RMSE}_{\mathcal{M}_2} = \sum_{\text{links}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \big(V(x, t)_i - \hat{V}(x, t)_{i|\mathcal{M}_2}\big)^2}$$

$$\mathrm{RMSE}_{\mathcal{M}_3} = \sum_{\text{links}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \big(V(x, t)_i - \hat{V}(x, t)_{i|\mathcal{M}_3}\big)^2}$$

with $\hat{V}(x, t)_{i|\mathcal{M}_2}$ the forecasted speed with a link-by-link
linear thresholded model and $\hat{V}(x, t)_{i|\mathcal{M}_3}$ the forecasted
speed with a linear thresholded model for the network, i.e.
link-by-link models extrapolated to the whole network.

We obtain $\mathrm{RMSE}_{\mathcal{M}_2} = 11\,795.27$ and
$\mathrm{RMSE}_{\mathcal{M}_3} = 12\,511.37$. The loss of quality associated
with the globalization, $\Delta\mathrm{RMSE}$, is equal to 6.07%. Thus we
can state that the extrapolation of our link-by-link models
is relevant and does not destroy the quality of fit compared to
our local models.



V. Conclusion 



Fig. 8. Percentage of speed reduction based on $(\mathcal{M}_4)$

We have built a global model which adapts itself locally
thanks to the normalization $\tilde{\theta}_0 = \theta_0 / \mathrm{FFS}(x)$.
With our data, the estimates are
$(\bar{\theta}_0, \bar{\theta}_1) = (0.66, 0.16)$.



We have tackled the issue of building a generic rule
able to predict the evolution of vehicle velocities when the
weather condition changes. For this, we have considered a
version of the Multivariate Adaptive Regression Splines model,
calibrated to get stable and accurate predictions. We thus
obtain model $(\mathcal{M}_4)$, which does not modify speeds below
a proportion of the free flow speed. Above this threshold
value, the model decreases the speed by a proportion of the
difference between this speed and the threshold value. Thus,
one can use model $(\mathcal{M}_4)$ to correct forecasted speeds with
information on weather conditions. To learn the model
over a data set, we had to construct a well-adapted learning
set. One of the difficulties of this task was to overcome the
non-stationarity of the observed data, which mix the
variability due to changes in weather conditions and
the variability due to changes in traffic conditions. This was
achieved by considering time neighborhoods around similar
velocities. Moreover, the desired stability of the decision rule
enables us to extend this method over a whole road network.
This is, to our knowledge, the first global quantitative analysis
of the impact of adverse weather on observed vehicle
velocities. It is a major improvement for forecasting travel
times when some knowledge of the weather conditions is
available, and it contributes to a better quality of the forecasts
produced by Mediamobile.
Moreover, this study is also a key to better understanding the
macroscopic impact of adverse weather conditions on the Free
Flow Speed. Some interesting results emerge from the
extrapolation of the linear thresholded model. One may think
that adverse weather conditions not only impact microscopic
speeds but also impact the well-known Free Flow Speed. The
global model $(\mathcal{M}_3)$ includes such a result. When building this
model, we obtain a speed at which adverse weather conditions
start impacting speeds, and this speed is nothing more than a
proportion of the Free Flow Speed, $\alpha \cdot \mathrm{FFS}(x)$. In this work,
we thus provide another reference speed, which can be considered
as the Free Flow Speed under a given adverse weather
condition.